Phonetic Events from the Labeling the European Portuguese Database for Speech Synthesis, FEUP/IPBDB

Authors

  • João Paulo Ramos Teixeira
  • Diamantino Freitas
  • Daniela Braga
  • Maria João Barros
  • Vagner Latsch
Abstract

This paper presents a new labeled speech signal database (FEUP/IPBDB) in Standard European Portuguese (hereafter SEP). The objective of this work is, on the one hand, to provide phonetic material for building Text-to-Speech (TTS) systems, either from scratch or to improve the quality of existing ones, and, on the other hand, to place at the service of the SEP scientific community a phonetically and prosodically valuable speech corpus, essential for Speech Synthesis or Phonetics research. Our purpose is to make it available to the scientific community, since there is no other database (DB) of its kind for European Portuguese (EP). The main features of the database are described, along with some basic statistical aspects. Some methodological problems and some phenomena in experimental phonetics observed during speech signal labeling are also discussed. Our approach is to produce a resource that can be further improved in subsequent steps with minimal rework. Phonetic, linguistic and technical consistency is guaranteed through the involvement of a multidisciplinary team.

Similar resources

HESITA(te) in Portuguese

Hesitations, so-called disfluencies, are a characteristic of spontaneous speech, playing a primary role in its structure and reflecting aspects of language production and the management of inter-communication. In this paper we present HESITA, a database of hesitations in European Portuguese speech, as a relevant basis for studying a variety of speech phenomena. Patterns of hesitatio...

Phonetically Transcribed Speech Corpus Designed for Context Based European Portuguese TTS

This paper presents a speech corpus for European Portuguese (EP), designed for context-based text-to-speech (TTS) synthesis systems. The speech corpus is intended for small-footprint engines and is composed of one sentence dedicated to each sequence of two phonemes of the language, incorporating as many language contexts as possible at diphone and word levels. The speech corpus is presented in ...

On the Identification of Word-Boundaries using Phonological Rules for Speech Recognition and Labeling

In this paper we studied the phonemic structure of the words’ beginnings and endings in standard European Portuguese (hereafter EP). The generativist description of the Portuguese phonology [1] was used as framework basis and the phonetic and acoustic experiments performed by Delgado-Martins [2] served as a model to the phonetic background in EP. We also compared the results between the expecte...

Automatic Phonetic Segmentation and Labelling of Spontaneous Speech

In this paper a tool for automatic segmentation and labeling of spontaneous speech is presented. It is developed and specially tuned for the European Portuguese (EP) language but simple changes are needed to convert it to other languages. The main purpose of this system is to quickly produce a high quality output of phonetic labels and related time boundaries using as input the speech signal on...

Unit Selection Speech Synthesis Using Phonetic-Prosodic Description of Speech Databases

This paper describes an approach to speech synthesis based on using speech databases at different stages of TTS process. Speech database units are phones in different segmental and prosodic contexts. Pitch synchronous segmentation and labeling of databases allows storing both segmental and prosodic information. Phonetic-prosodic annotations of speech databases are involved in off-line training ...


Publication date: 2001